Artificial Intelligence Using Multimodal Data for Myelodysplastic Syndrome Screening

De almeida Braga, Cedric; Bauvais, Maxence; Dominici, Maxence; Garnier, Alice; Peterlin, Pierre; Debord, Camille; Wuillème, Soraya; Theisen, Olivier; Godon, Catherine; Bouzy, Simon; Le Bris, Yannick; Bene, Marie C; Chevallier, Patrice; Normand, Nicolas; Eveillard, Marion

doi:10.1182/blood-2022-166979

Abstract

Introduction Myelodysplastic syndrome (MDS) is a clonal disorder of hematopoietic stem cells that results in peripheral blood cytopenia and dysplasia. MDS diagnosis currently begins with a screening step that combines a complete blood count with differential (CBC + DIFF) and morphologic examination of bone marrow smears, followed by cytogenetic and molecular analyses. While the CBC + DIFF is now widely automated, the screening process as a whole still relies mainly on morphology experts.

Such manual work is tedious, requires specialized knowledge, and is often error-prone, so physicians could benefit from dedicated computer-aided diagnosis (CAD) tools. Artificial intelligence, and deep learning in particular, has recently shown very good results in this field, and healthcare applications have multiplied over the past few years, for example in medical image analysis for the diagnosis of diverse pathologies. For MDS in particular, previous studies have addressed detection of the syndrome at the screening step using numerical CBC + DIFF values (Boutault et al., BJH 2018).

In this work we propose an approach using deep learning to leverage both numerical CBC + DIFF parameters and image analysis for the detection of MDS patients at the screening step.

Patients, material and methods To create a CAD tool for MDS detection, a convolutional neural network (CNN), a type of deep learning model, was trained in 2 successive stages on 2 patient cohorts. For each cohort, CBC + DIFF were performed automatically on a Sysmex XN10® analyzer, and images of neutrophils (PMN) were collected (CellaVision®). In total, 60 patients and 14,579 PMN images were used.

The first cohort was composed of healthy control patients (n=510 images) and confirmed MDS patients (n=672 images). Data augmentation was then performed, and a CNN (Figure 1 A.) was trained to discriminate dysplastic neutrophils (n=5,416) from normal ones (n=4,040). The second cohort comprised healthy subjects (1,462 images), MDS patients (1,220 images) and patients suffering from other diseases (1,259 images). For this cohort, CBC + DIFF parameters were collected in addition to neutrophil images. The first CNN pipeline was extended: visual features were extracted from all neutrophil images of a given patient and fused with the informative CBC + DIFF parameters mentioned above (Boutault et al.). This extended CNN pipeline was designed to regress a score in [0,1] indicating the probability that the patient has MDS (Figure 1 B.).
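The fusion step described above can be illustrated with a minimal sketch. The abstract does not specify how per-image features are aggregated or how they are combined with the CBC + DIFF values, so mean pooling, concatenation, and a logistic output are assumptions made here for illustration only, and every name and number below is hypothetical.

```python
import math

def mean_pool(image_features):
    """Average per-image feature vectors into one patient-level vector."""
    n = len(image_features)
    dim = len(image_features[0])
    return [sum(vec[i] for vec in image_features) / n for i in range(dim)]

def fuse(image_features, cbc_params):
    """Fusion by concatenation: pooled visual features followed by CBC + DIFF values."""
    return mean_pool(image_features) + list(cbc_params)

def mds_score(fused, weights, bias):
    """Logistic output in [0, 1]; a stand-in for a trained regression head."""
    z = sum(w * x for w, x in zip(weights, fused)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical patient: three neutrophil images with 2-D features, two CBC + DIFF values.
features = [[0.2, 0.8], [0.4, 0.6], [0.6, 0.4]]
cbc = [1.5, 0.3]
fused = fuse(features, cbc)
# Made-up weights; in practice they would be learned during training.
score = mds_score(fused, weights=[0.5, -0.5, 0.2, 1.0], bias=-0.6)
```

Pooling before fusion lets the patient-level vector keep a fixed size regardless of how many neutrophil images are available for that patient.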

For the first cohort, the dataset was split into training, validation and test sets, the test set being composed of images from previously unseen patients. The extended MDS prediction pipeline was then trained on the second cohort using stratified 5-fold cross-validation.
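As an illustration of stratified 5-fold cross-validation, the sketch below assigns patients to folds so that each fold keeps similar class proportions. Splitting at the patient level, the round-robin assignment, and the cohort composition are assumptions for this example and are not taken from the study.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=5, seed=0):
    """Assign each patient to one of k folds with similar class proportions.

    labels: dict mapping patient id -> class label.
    Returns a dict mapping patient id -> fold index in range(k).
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for pid, label in labels.items():
        by_class[label].append(pid)
    fold_of = {}
    for label in sorted(by_class):
        pids = sorted(by_class[label])
        rng.shuffle(pids)
        for i, pid in enumerate(pids):
            fold_of[pid] = i % k  # deal patients round-robin within each class
    return fold_of

def split(fold_of, test_fold):
    """Return (train ids, test ids) for one cross-validation round."""
    train = [p for p, f in fold_of.items() if f != test_fold]
    test = [p for p, f in fold_of.items() if f == test_fold]
    return train, test

# Hypothetical cohort: 10 healthy, 10 MDS and 10 other-disease patients.
labels = {f"P{i:02d}": ("healthy", "mds", "other")[i % 3] for i in range(30)}
fold_of = stratified_folds(labels, k=5)
train_ids, test_ids = split(fold_of, test_fold=0)
```

Because all images of a patient follow that patient into a single fold, no patient contributes to both training and evaluation within a round.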

Results The pre-training task, using only the CNN on neutrophil images, yielded an area under the ROC curve (AUC) of 0.983 with 92.9% accuracy. This CNN can hence be used as is to estimate the rate of dysplastic neutrophils for each patient.
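One simple way to turn per-image CNN outputs into a per-patient dysplastic neutrophil rate is sketched below. The 0.5 decision threshold and the example probabilities are assumptions for illustration; the abstract does not state how the rate would be computed.

```python
def dysplastic_rate(image_probs, threshold=0.5):
    """Fraction of a patient's neutrophil images classified as dysplastic."""
    if not image_probs:
        raise ValueError("at least one image probability is required")
    flagged = sum(1 for p in image_probs if p >= threshold)
    return flagged / len(image_probs)

# Hypothetical per-image dysplasia probabilities for one patient.
probs = [0.91, 0.12, 0.77, 0.40, 0.65]
rate = dysplastic_rate(probs)  # 3 of the 5 images are at or above the threshold
```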

The extended pipeline for MDS probability score regression achieved an average accuracy across folds of 80.8% for diagnosis prediction. These results outperform state-of-the-art methods using only CBC + DIFF parameters.

Conclusion A new CAD tool based on deep learning is presented. It leverages multimodal data to improve on previous methods for predicting the presence of MDS in a patient at the screening stage. The next step of this work will be a multicenter study to further validate the robustness of the pipeline.

Among the promising leads for improving the pipeline, algorithms that generate synthetic multimodal data on which training could be performed will be investigated. It could also be interesting to devise a semi-automatic classification framework in which the algorithm flags cases where prediction uncertainty remains high.
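A semi-automatic framework of this kind could, for instance, route scores that fall in an intermediate band to an expert. The sketch below is only one possible reading of that idea: the band limits (0.3 and 0.7) and the output labels are hypothetical and not proposed by the authors.

```python
def triage(score, low=0.3, high=0.7):
    """Map an MDS probability score to a decision, flagging uncertain cases."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must lie in [0, 1]")
    if score >= high:
        return "MDS suspected"
    if score <= low:
        return "MDS unlikely"
    return "refer to expert review"  # the score is too close to the decision boundary

decisions = [triage(s) for s in (0.05, 0.48, 0.92)]
```

Widening the band sends more cases to manual review and narrowing it automates more decisions, so the limits would need to be tuned on validation data.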

Figure 1


Disclosures

Chevallier: Incyte: Research Funding; Takeda: Honoraria; Jazz Pharmaceuticals: Honoraria; Pfizer: Research Funding; Abbvie: Honoraria.

Author notes

* Asterisk with author names denotes non-ASH members.

2022
